TITLE OF THE INVENTION

METHOD AND APPARATUS FOR SPEECH RECOGNITION 

CROSS-REFERENCE TO RELATED APPLICATIONS 

[0001] This application claims the priority of Korean Patent Application No. 2002-87943, filed 
on December 31, 2002, in the Korean Intellectual Property Office, the disclosure of which is 
incorporated herein in its entirety by reference. 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

[0002] The present invention relates to speech recognition, and more particularly, to a 
method and apparatus for enhancing the performance of speech recognition by adaptively 
changing a process of determining a final, recognized word depending on a user's selection in a 
list of alternative words represented by a result of speech recognition. 

2. Description of the Related Art 

[0003] Speech recognition refers to a technique by which a computer analyzes and 
recognizes or understands human speech. Human speech sounds have specific frequencies 
according to the shape of a human mouth and positions of a human tongue during utterance. In 
other words, in speech recognition technology, human speech sounds are converted into 
electric signals and frequency characteristics of the speech sounds are extracted from the 
electric signals, in order to recognize human utterances. Such speech recognition technology is 
adopted in a wide variety of fields such as telephone dialing, control of electronic toys, language 
learning, control of electric home appliances, and so forth. 

[0004] Despite the advancement of speech recognition technology, speech recognition 
cannot yet be fully accomplished due to background noise or the like in an actual speech 
recognition environment. Thus, errors frequently occur in speech recognition tasks. In order to 
reduce the probability of the occurrence of such errors, there are employed methods of 
determining a final, recognized word depending on user confirmation or selection by requesting 
the user to confirm recognition results of a speech recognizer or by presenting the user with a 
list of alternative words derived from the recognition results of the speech recognizer.

[0005] Conventional techniques associated with the above methods are disclosed in U.S.
Patent Nos. 4,866,778, 5,027,406, 5,884,258, 6,314,397, 6,347,296, and so on. U.S. Patent
No. 4,866,778 suggests a technique by which the most effectively searched probable alternative
word is displayed and, if the probable alternative word is wrong, the next alternative word is
displayed to find the correct recognition result. According to this technique, a user must
separately answer a series of YES/NO questions presented by a speech recognition system
and cannot predict which words will appear in the next question. U.S. Patent Nos. 5,027,406
and 5,884,258 present a technique by which alternative words derived from speech recognition
are arrayed and recognition results are determined depending on the user's selections from the
alternative words via a graphic user interface or voice. According to this technique, since the
user must perform additional manipulations to select the correct alternative word in each case
after he or she speaks, the user is inconvenienced and tires of the repetitive
operations. U.S. Patent No. 6,314,397 shows a technique by which a user's utterances are
converted into texts based on the best recognition results and corrected through a user review
during which an alternative word is selected from a list of alternative words derived from
previously considered recognition results. This technique suggests a smooth speech
recognition task. However, when the user uses a speech recognition system in real time, the
user must compose a sentence while viewing the recognition results. U.S. Patent No. 6,347,296 discloses
a technique by which, during a series of speech recognition tasks, an indefinite recognition result
of a specific utterance is settled by automatically selecting an alternative word from a list of
alternative words with reference to a recognition result of a subsequent utterance.

[0006] As described above, according to conventional speech recognition technology, 
although a correct recognition result of user speech is obtained, an additional task such as user 
confirmation or selection must be performed at least once. In addition, when the user
confirmation is not performed, an unlimited amount of time is taken to determine a final,
recognized word. 

SUMMARY OF THE INVENTION 

[0007] The present invention provides a speech recognition method of determining a first 
alternative word as a final, recognized word after a predetermined standby time in a case where 
a user does not select an alternative word from a list of alternative words derived from speech 
recognition of the user's utterance, determining a selected alternative word as a final,
recognized word when the user selects the alternative word, or determining an alternative word
selected after an adjusted standby time as a final, recognized word. 

[0008] The present invention also provides an apparatus for performing the speech 
recognition method. 

[0009] According to an aspect of the present invention, there is provided a speech 
recognition method comprising: inputting speech uttered by a user; recognizing the input speech 
and creating a predetermined number of alternative words to be recognized in order of 
similarity; and displaying a list of alternative words arranged in a predetermined order and 
determining an alternative word that a cursor currently indicates as a final, recognized word if a 
user's selection from the list of alternative words has not been changed within a predetermined
standby time. 

[0010] Preferably, but not necessarily, the speech recognition method further comprises
adjusting the predetermined standby time and returning to the determination whether the user's 
selection from the list of alternative words has been changed within the predetermined standby 
time, if the user's selection has been changed within the predetermined standby time. The 
speech recognition method further comprises determining an alternative word from the list of 
alternative words that is selected by the user as a final, recognized word, if the user's selection 
is changed within the predetermined standby time. 

[0011] According to another aspect of the present invention, there is provided a speech 
recognition apparatus comprising: a speech input unit that inputs speech uttered by a user; a 
speech recognizer that recognizes the speech input from the speech input unit and creates a 
predetermined number of alternative words to be recognized in order of similarity; and a post- 
processor that displays a list of alternative words arranged in a predetermined order and 
determines an alternative word that a cursor currently indicates as a final, recognized word if a 
user's selection from the list of alternative words has not been changed within a predetermined 
standby time. 

[0012] Preferably, the post-processor comprises: a window generator that generates a 
window for a graphic user interface comprising the list of alternative words; a standby time 
setter that sets a standby time from when the window is displayed to when the alternative word 
on the list of alternative words currently indicated by the cursor is determined as the final, 
recognized word; and a final, recognized word determiner that determines a first alternative
word from the list of alternative words that is currently indicated by the cursor as a final,
recognized word if the user's selection from the list of alternative words has not been changed
within the predetermined standby time, adjusts the predetermined standby time if the user's
selection from the list of alternative words has been changed within the predetermined standby 
time, and determines an alternative word on the list of alternative words selected by the user as 
a final, recognized word if the user's selection has not been changed within the adjusted 
standby time. 

[0013] Preferably, but not necessarily, the post-processor comprises: a window generator that
generates a window for a graphic user interface comprising a list of alternative words that 
arranges the predetermined number of alternative words in a predetermined order; a standby 
time setter that sets a standby time from when the window is displayed to when an alternative 
word on the list of alternative words currently indicated by the cursor is determined as a final, 
recognized word; and a final, recognized word determiner that determines a first alternative 
word on the list of alternative words currently indicated by the cursor as a final, recognized word 
if a user's selection from the list of alternative words has not been changed within the standby 
time and determines an alternative word on the list of alternative words selected by the user as 
a final, recognized word if the user's selection from the list of alternative words has been 
changed. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0014] The above and/or other features and advantages of the present invention will become
more apparent by describing in detail exemplary embodiments thereof with reference to the 
attached drawings in which: 

FIG. 1 is a block diagram of a speech recognition apparatus according to an 
embodiment of the present invention; 

FIG. 2 is a detailed block diagram of a post-processor of FIG. 1;

FIG. 3 is a flowchart for explaining a process of updating an erroneous word pattern 
database (DB) by an erroneous word pattern manager of FIG. 2; 

FIG. 4 is a table showing an example of the erroneous word pattern DB of FIG. 2; 

FIG. 5 is a flowchart explaining a process of changing the order of the arrangement of 
alternative words by the erroneous word pattern manager of FIG. 2;

FIG. 6 is a flowchart explaining a process of adjusting a standby time by a dexterity
manager of FIG. 2;

FIG. 7 is a flowchart explaining a speech recognition method, according to an 
embodiment of the present invention; 

FIG. 8 is a flowchart explaining a speech recognition method, according to another 
embodiment of the present invention; and 

FIG. 9 shows an example of a graphic user interface according to the present invention. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

[0015] Hereinafter, embodiments of the present invention will be described in detail with 
reference to the attached drawings. 

[0016] FIG. 1 is a block diagram of a speech recognition apparatus according to an 
embodiment of the present invention. Referring to FIG. 1, the speech recognition apparatus
includes a speech input unit 11, a speech recognizer 13, and a post-processor 15. 

[0017] The speech input unit 11 includes microphones and so forth, receives speech from a 
user, removes a noise signal from the user speech, amplifies the user speech to a 
predetermined level, and transmits the user speech to the speech recognizer 13. 

[0018] The speech recognizer 13 detects a starting point and an ending point of the user 
speech, samples speech feature data from sound sections except soundless sections before 
and after the user speech, and vector-quantizes the speech feature data in real-time. Next, the 
speech recognizer 13 performs a Viterbi search to choose the closest acoustic word to the user
speech from words stored in a DB using the speech feature data. To this end, Hidden Markov
Models (HMMs) may be used. Feature data of HMMs, which are built by training words, is
compared with that of currently input speech and the difference between the two feature data is
used to determine the most probable candidate word. The speech recognizer 13 completes the
Viterbi search, determines a predetermined number, for example, three, of the closest acoustic
words to currently input speech as recognition results in the order of similarity, and transmits the
recognition results to the post-processor 15.
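
The selection of a predetermined number of closest words can be illustrated with a minimal sketch. It assumes that the search has already produced one similarity score per vocabulary word; the function name `n_best`, the score dictionary, and the fourth word are hypothetical and are not part of the disclosure.

```python
import heapq


def n_best(scores, n=3):
    """Return the n highest-scoring (word, score) pairs, best first."""
    return heapq.nlargest(n, scores.items(), key=lambda item: item[1])


# Hypothetical similarity scores (a higher score means a closer acoustic match).
scores = {
    "Hwang Gil Du": 10,
    "Hong Gi Su": 9,
    "Hong Gil Dong": 8,
    "Han Gi Su": 5,   # assumed extra vocabulary word, dropped by n=3
}
candidates = n_best(scores)
```

Here `candidates` holds the three alternative words that would be passed on for display, ordered by similarity.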

[0019] The post-processor 15 receives the recognition results from the speech recognizer 13, 
converts the recognition results into text signals, and creates a window for a graphic user 
interface. Here, the window displays the text signals in the order of similarity. An example of
the window is shown in FIG. 9. As shown in FIG. 9, a window 91 includes a message area 92
for displaying a message "A first alternative word, herein, 'Tam Saek Gi', is being recognized",
an area 93 for displaying a time bar, and an area 94 for displaying a list of alternative words. 
The window 91 is displayed on a screen until the time belt 93 corresponding to a predetermined 
standby time is over. In addition, in a case where there is no additional user input with an 
alternative word selection key or button within the standby time, the first alternative word is 
determined as a final, recognized word. On the contrary, when there is an additional user input 
with the alternative word selection key or button within the standby time, a final, recognized 
word is determined through a process shown in FIG. 7 or 8 which will be explained later. 

[0020] FIG. 2 is a detailed block diagram of the post-processor 15 of FIG. 1. Referring to 
FIG. 2, the post-processor 15 includes a standby time setter 21, a dexterity manager 22, a
dexterity DB 23, a window generator 24, an erroneous word pattern manager 25, an erroneous
word pattern DB 26, and a final, recognized word determiner 27.

[0021] The standby time setter 21 sets a standby time from a point in time when the window 
91 for the graphic user interface is displayed to a point in time when an alternative word 
currently indicated by a cursor is determined as a final, recognized word. The standby time is 
represented by the time bar 93 in the window 91. The standby time may be equally assigned to
all of the alternative words on the list of alternative words. The standby time may also be 
assigned differentially to each of the alternative words from the most acoustically similar 
alternative word to the least acoustically similar alternative word. The standby time may be 
equally assigned to users or may be assigned differentially to the user depending on the user's 
dexterity at handling the speech recognition apparatus. The standby time setter 21 provides the 
window generator 24 with the standby time and the recognition results input from the speech 
recognizer 13. 

[0022] The dexterity manager 22 adds a predetermined spare time to a selection time 
determined based on information on a user's dexterity stored in the dexterity DB 23, adjusts the 
standby time to the addition value, and provides the standby time setter 21 with the adjusted 
standby time. Here, the dexterity manager 22 adjusts the standby time through a process 
shown in FIG. 6 which will be explained later. In addition, the standby time may be equally 
assigned to all of the alternative words or may be assigned differentially to each of the 
alternative words from the most acoustically similar alternative word to the least acoustically
similar alternative word. 

[0023] The dexterity DB 23 stores different selection times determined based on the user's
dexterity. Here, 'dexterity' is a variable in inverse proportion to a selection time required for
determining a final, recognized word after the window for the graphic user interface is displayed.
In other words, an average value of the selection times required over a predetermined number of
times a final, recognized word is determined is used to determine the user's dexterity.
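
The relationship between selection time and dexterity can be sketched as follows. This is only an illustration: the class name `DexterityRecord`, the window size, and the use of a plain reciprocal for "inverse proportion" are assumptions, not details given in the description.

```python
from collections import deque


class DexterityRecord:
    """Keeps the most recent selection times (in seconds) for one user."""

    def __init__(self, window=5):
        # Only the last `window` selection times are averaged (assumed value).
        self.times = deque(maxlen=window)

    def add(self, selection_time):
        self.times.append(selection_time)

    def average_selection_time(self):
        # Requires at least one recorded selection time.
        return sum(self.times) / len(self.times)

    def dexterity(self):
        # Inverse proportion: a shorter average selection time gives a higher value.
        return 1.0 / self.average_selection_time()


record = DexterityRecord(window=2)
for t in (1.0, 0.5, 0.5):
    record.add(t)
```

With a window of two, only the last two selection times contribute to the average, so a user who speeds up is reflected quickly in the stored value.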

[0024] The window generator 24 generates the window 91 including the message area 92, 
the time belt 93, and the alternative word list 94 as shown in FIG. 9. The message area 92
displays a current situation, the time belt 93 corresponds to the standby time set by the standby
time setter 21, and the alternative word list 94 lists the recognition results, i.e., the alternative
words, in the order of similarity. Here, the order of listing the alternative words may be 
determined based on erroneous word patterns appearing in previous speech recognition history 
as well as the similarity. 

[0025] The erroneous word pattern manager 25 receives a recognized word determined as 
the first alternative word by the speech recognizer 13 and the final, recognized word provided by 
the final, recognized word determiner 27. If the erroneous word pattern DB 26 stores the 
combination of the recognized words corresponding to the first alternative word and the final, 
recognized word, the erroneous word pattern manager 25 adjusts scores of the recognition 
results supplied from the speech recognizer 13 via the standby time setter 21 and the window 
generator 24 to be provided to the window generator 24. The window generator 24 then 
changes the listing order of the alternative word list 94 based on the adjusted scores. For 
example, if "U hui jin" is determined as the first alternative word and "U ri jib" is determined as
the final, recognized word, a predetermined weight is applied to "U ri jib". As a result, although the
speech recognizer 13 determines "U hui jin" as the first alternative word, the window generator
24 can array "U ri jib" in a higher position than "U hui jin".

[0026] When the first alternative word and the final, recognized word are different, the 
erroneous word pattern DB 26 stores the first alternative word and the final, recognized word as 
erroneous word patterns. As shown in FIG. 4, an erroneous word pattern table includes a first 
alternative word 41 resulting from speech recognition, a final, recognized word 42, first through 
nth user utterance features 43, an utterance propensity 44, and a number of times errors occur,
i.e., a history n 45.

[0027] The final, recognized word determiner 27 determines the final, recognized word 
depending on whether the user makes an additional selection from the alternative word list 94 in 
the window 91 within the standby time represented by the time belt 93. In other words, when 
the user does not additionally press the alternative word selection key or button within the 
standby time after the window 91 is displayed, the final, recognized word determiner 27 
determines the first alternative word currently indicated by the cursor as the final, recognized
word. When the user presses the alternative word selection key or button within the standby 
time, the final, recognized word determiner 27 determines the final, recognized word through the 
process of FIG. 7 or 8. 

[0028] FIG. 3 is a flowchart for explaining an updating process of the erroneous word pattern 
DB 26 by the erroneous word pattern manager 25 of FIG. 2. Referring to FIG. 3, in operation
31, a determination is made as to whether the erroneous word pattern DB 26 stores a pair of the
first alternative word and the final, recognized word provided by the final, recognized word
determiner 27. If in operation 31, it is determined that the erroneous word pattern DB 26 does
not store the pair of the first alternative word and the final, recognized word, the process ends. 

[0029] If in operation 31, it is determined that the erroneous word pattern DB 26 stores the
pair of the first alternative word and the final, recognized word, in operation 32, a difference 
value in utterance features is calculated. The difference value is a value obtained by adding 
absolute values of differences between the first through nth utterance features 43 of the 
erroneous word patterns stored in the erroneous word pattern DB 26 and first through nth 
utterance features of currently input speech. 

[0030] In operation 33, the difference value is compared with a first threshold, that is, a 
predetermined reference value for update. The first threshold may be set to an optimum value 
experimentally or through a simulation. If in operation 33, it is determined that the difference 
value is greater than or equal to the first threshold, the process ends. If in operation 33, the 
difference value is less than the first threshold, i.e., if it is determined that an error occurs for 
reasons such as a cold, voice change in the mornings, background noise, or the like, in 
operation 34, respective average values of the first through nth utterance features including 
those of the currently input speech are calculated to update the utterance propensity 44. In
operation 35, a value of a history n is increased by 1 to update the history 45.
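
The update process of FIG. 3 can be sketched as below. The sketch makes several assumptions that the description does not fix: the DB is modeled as a dictionary keyed by the word pair, the stored utterance features and the utterance propensity are kept as a single running average, and the names `ErrorPattern`, `feature_difference`, and `update_pattern` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ErrorPattern:
    features: list   # stored utterance features, kept as a running average (assumed)
    history: int     # number of times this error has occurred (history n)


def feature_difference(stored, current):
    """Sum of absolute differences between two utterance feature vectors."""
    return sum(abs(s - c) for s, c in zip(stored, current))


def update_pattern(db, first_word, final_word, current, threshold):
    """Return True if the stored pattern was updated."""
    pattern = db.get((first_word, final_word))
    if pattern is None:                                   # operation 31
        return False
    diff = feature_difference(pattern.features, current)  # operation 32
    if diff >= threshold:                                 # operation 33
        return False
    n = pattern.history
    # Operation 34: fold the current features into the running average.
    pattern.features = [
        (f * n + c) / (n + 1) for f, c in zip(pattern.features, current)
    ]
    pattern.history = n + 1                               # operation 35
    return True


# Hypothetical feature values for the word pair used as an example in the text.
db = {("U hui jin", "U ri jib"): ErrorPattern(features=[1.0, 2.0], history=3)}
updated = update_pattern(db, "U hui jin", "U ri jib", [2.0, 2.0], threshold=2.0)
```

In this example the difference value is 1.0, which is below the assumed first threshold of 2.0, so the averages are refreshed and the history is incremented.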

[0031] FIG. 5 is a flowchart explaining a process of changing an order of listing alternative 
words via the erroneous word pattern manager 25 of FIG. 2. Referring to FIG. 5, in operation
51, a determination is made as to whether the erroneous word pattern DB 26 stores a pair of a
first alternative word and a second alternative word as the final, recognized word or a pair of a
first alternative word and a third alternative word as the final, recognized word, with reference to
the recognition results and scores of Table 1 provided to the window generator 24 via the
speech recognizer 13. If in operation 51, it is determined that the erroneous word pattern DB 26
does not store the pair of the first alternative word and the second alternative word or the pair of
the first alternative word and the third alternative word, the process ends. Here, Table 1 shows
scores of first to third alternative words.



[Table 1]

Recognition Result    Score
Hwang Gil Du          10
Hong Gi Su            9
Hong Gil Dong         8



[0032] If in operation 51, it is determined that the erroneous word pattern DB 26 stores the
pair of the first alternative word and the second alternative word or the pair of the first alternative
word and the third alternative word, in operation 52, a difference value in first through nth
utterance features is calculated. As described with reference to FIG. 3, the difference value is a
value obtained by adding absolute values of differences between the first through nth utterance
features stored in the erroneous word pattern DB 26 and first through nth utterance features of
currently input speech, with regard to each pair. 

[0033] In operation 53, the difference value is compared with a second threshold, that is, a
predetermined reference value for changing the order of listing the alternative words. The
second threshold may be set to an optimum value experimentally or through a simulation. If in
operation 53, it is determined that the difference value in each pair is greater than or equal to
the second threshold, i.e., an error does not occur for the same reason as the erroneous word
pattern, the process ends. If in operation 53, it is determined that the difference value in each
pair is less than the second threshold, i.e., the error occurs for the same reason as the
erroneous word pattern, in operation 54, a score of a corresponding alternative word is
adjusted. For example, in a case where the erroneous word pattern DB 26 stores an erroneous
word pattern table as shown in FIG. 4, that is, the pair of the first alternative word and the third 
alternative word as a final, recognized word and the weight is set to 0.4, the recognition results 
and the scores shown in Table 1 are changed into recognition results and scores shown in Table 
2. Here, a changed score "9.2" is obtained by adding a value resulting from multiplication of the 
weight "0.4" by a history "3" to an original score "8". 



[Table 2]

Recognition Result    Score
Hwang Gil Du          10
Hong Gil Dong         9.2
Hong Gi Su            9
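
The score adjustment of operation 54 can be illustrated with a short sketch that raises the score of "Hong Gil Dong" from 8 to 9.2 using the weight 0.4 and the history 3 given above. It assumes that the utterance feature comparison of operations 52 and 53 has already passed; the function name `rerank` and the dictionary layout of the stored history are hypothetical.

```python
def rerank(results, pattern_history, weight):
    """Adjust scores using stored error patterns and re-sort, best first.

    results: list of (word, score) pairs, best first.
    pattern_history: {(first_word, final_word): history count} (assumed layout).
    """
    first_word = results[0][0]
    adjusted = [
        (word, score + weight * pattern_history.get((first_word, word), 0))
        for word, score in results
    ]
    # sorted() is stable, so tied words keep the recognizer's original order.
    return sorted(adjusted, key=lambda item: item[1], reverse=True)


table_1 = [("Hwang Gil Du", 10), ("Hong Gi Su", 9), ("Hong Gil Dong", 8)]
history = {("Hwang Gil Du", "Hong Gil Dong"): 3}
table_2 = rerank(table_1, history, weight=0.4)

for word, score in table_2:
    print(f"{word:<15}{round(score, 2)}")
```

Because the adjusted score of the third alternative word exceeds 9, it moves ahead of the second alternative word in the listing order while the first alternative word stays in place.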



[0034] Meanwhile, the first through nth utterance features 43 used in the processes shown in 
FIGS. 3 through 5 are information generated when the speech recognizer 13 analyzes speech. 
In other words, the information may be information, a portion of which is used for determining 
speech recognition results and a remaining portion of which is used only as reference data. The 
information may also be measured using additional methods as follows. 

[0035] First, a time required for uttering a corresponding number of syllables is defined as an 
utterance speed. Next, a voice tone is defined. When the voice tone is excessively lower or 
higher than a microphone volume set in hardware, the voice tone may be the cause of an error. 
For example, a low-pitched voice is hidden by noise and a high-pitched voice is not partially 
received by hardware. As a result, a voice signal may be distorted. Third, a basic noise level, 
which is measured in a state when no voice signal is input or in a space between syllables, is 
defined as a signal-to-noise ratio (SNR). Finally, the change of voice is defined in a specific 
situation where a portion of voice varies due to a cold or a problem with the vocal cords that
occurs in the mornings. In addition, various other utterance features may be used.

[0036] FIG. 6 is a flowchart explaining a process of adjusting the standby time via the
dexterity manager 22 of FIG. 2. Referring to FIG. 6, in operation 61, a difference value in a
selection time is calculated by subtracting a time required for determining a final, recognized
word from the selection time assigned as an initialization value stored in the dexterity DB 23. In
operation 62, the difference value is compared with a third threshold, that is, a predetermined
reference value for changing the standby time. The third threshold may be set to an optimum 
value experimentally or through a simulation. If in operation 62, it is determined that the 
difference value is greater than the third threshold, i.e., a given time is longer than a time for 
which the user can determine a selection, in operation 63, the selection time is modified. The 
modified selection time is calculated by subtracting a value resulting from multiplying the 
difference value by the predetermined weight from the selection time stored in the dexterity DB 
23. For example, when the selection time stored in the dexterity DB 23 is 0.8 seconds, the
difference value is 0.1 seconds, and the predetermined weight is 0.1, the modified selection 
time is 0.79 seconds. The modified selection time is stored in the dexterity DB 23 so as to 
update a selection time for the user. 

[0037] If in operation 62, it is determined that the difference value is less than or equal to the 
third threshold, i.e., a final user's selection is determined by a timeout of the speech recognition 
system after the selection time ends, in operation 64, the difference value is compared with a 
predetermined spare time. If in operation 64, it is determined that the difference value is greater
than or equal to the spare time, the process ends. 

[0038] If in operation 64, it is determined that the difference value is less than the spare time, 
in operation 65, the selection time is modified. The modified selection time is calculated by 
adding a predetermined extra time to the selection time stored in the dexterity DB 23. For 
example, when the selection time stored in the dexterity DB 23 is 0.8 seconds and the extra 
time is 0.02 seconds, the modified selection time is 0.82 seconds. The modified selection time 
is stored in the dexterity DB 23 so as to update a selection time for a user. The extra time is to 
prevent a potential error from occurring in a subsequent speech recognition process, and 
herein, is set to 0.02 seconds. 

[0039] In operation 66, a standby time for the user is calculated by adding a predetermined
amount of extra time to the selection time modified in operation 63 or 65, and the standby time
setter 21 is informed of the calculated standby time. Here, the extra time is to prevent a user's
selection from being determined regardless of the user's intention, and herein, is set to 0.3
seconds.
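
The process of FIG. 6 can be sketched as two small functions that reproduce the worked figures above (0.79 seconds, 0.82 seconds, and the 0.3 second addition). The third threshold and the spare time are not given numerically in the description, so the values passed below are assumptions, as are the function and parameter names.

```python
def adjust_selection_time(stored, actual, threshold, weight, spare_time, extra_time):
    """Return the selection time to store back in the dexterity DB (seconds)."""
    diff = stored - actual              # operation 61
    if diff > threshold:                # operation 62: user decided faster than allowed
        return stored - diff * weight   # operation 63
    if diff < spare_time:               # operation 64
        return stored + extra_time      # operation 65
    return stored                       # otherwise the stored value is kept


def standby_time(selection_time, margin=0.3):
    """Operation 66: add the 0.3 second margin given in the text."""
    return selection_time + margin


# Stored time 0.8 s, difference value 0.1 s, weight 0.1 (threshold 0.05 s assumed).
faster = adjust_selection_time(0.8, 0.7, threshold=0.05, weight=0.1,
                               spare_time=0.05, extra_time=0.02)
# Timeout case: no time to spare, so 0.02 s is added (spare time 0.05 s assumed).
slower = adjust_selection_time(0.8, 0.8, threshold=0.05, weight=0.1,
                               spare_time=0.05, extra_time=0.02)
print(round(faster, 2), round(slower, 2), round(standby_time(faster), 2))
```

The small weight means that a single fast selection shortens the stored time only slightly, so the standby time drifts toward the user's pace over several recognitions instead of jumping.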

[0040] FIG. 7 is a flowchart for explaining a speech recognition method, according to an 
embodiment of the present invention. The speech recognition method includes operation 71 of 
displaying a list of alternative words, operations 72, 73, and 78 performed in a case of no 
change in a user's selection, and operations 74, 75, 76, and 77 performed in a case where a 
user's selection changes. 

[0041] Referring to FIG. 7, in operation 71, the window 91 including the alternative word list
94 listing the recognition results of the speech recognizer 13 is displayed. In the present
invention, the cursor is set to indicate the first alternative word on the alternative word list 94
when the window 91 is displayed. In addition, the time belt 93 starts from when the window 91
is displayed. In operation 72, a determination is made as to whether an initial standby time set
by the standby time setter 21 has elapsed without additional user input with the alternative word 
selection key or button. 

[0042] If in operation 72, it is determined that the initial standby time has elapsed, in
operation 73, a first alternative word currently indicated by the cursor is determined as a final,
recognized word. In operation 78, a function corresponding to the final, recognized word is
performed. If in operation 72, it is determined that the initial standby time has not elapsed, in
operation 74, a determination is made as to whether a user's selection has been changed by 
the additional user input with the alternative word selection key or button. 

[0043] If in operation 74, it is determined that the user's selection has been changed, in 
operation 75, the initial standby time is reset. Here, the adjusted standby time may be equal to
or different from the initial standby time according to an order of listing the alternative words. If
in operation 74, it is determined that the user's selection has not been changed, the process
moves on to operation 76. For example, if the user's selection is changed into 'Tan Seong Ju
Gi' shown in FIG. 9, the message area 92 of the window 91 shows a message "The 'Tan Seong
Ju Gi' is being recognized" and concurrently, the time belt 93 starts according to the adjusted
standby time. 

[0044] In operation 76, a determination is made as to whether the standby time adjusted in
operation 75 or the initial standby time has elapsed. If in operation 76, it is determined that the
adjusted standby time or the initial standby time has not elapsed, the process returns to
operation 74 to iteratively determine whether the user's selection has been changed. If in
operation 76, it is determined that the adjusted standby time or the initial standby time has
elapsed, in operation 77, an alternative word that the cursor currently indicates as a result of a
change in the user's selection is determined as a final, recognized word. In operation 78, a function
corresponding to the final, recognized word is performed.

Docket No.: 1793.1160
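
The flow of FIG. 7 can be sketched as a function over a list of timestamped selection inputs. The encoding of user input as `(time, index)` pairs, the function name `decide_fig7`, the third list entry, and the one second standby times are assumptions made for illustration only.

```python
def decide_fig7(words, events, initial_standby, adjusted_standby):
    """Return (final word, time in seconds at which it is determined).

    events: list of (time, index) selection inputs measured from the
    moment the window is displayed (assumed input encoding).
    """
    cursor = 0                               # operation 71: cursor on the first word
    deadline = initial_standby
    for time, index in sorted(events):
        if time >= deadline:                 # operations 72/76: standby time elapsed
            break
        if index != cursor:                  # operation 74: selection changed
            cursor = index
            deadline = time + adjusted_standby   # operation 75: restart the timer
    return words[cursor], deadline           # operations 73/77


words = ["Tam Saek Gi", "Tan Seong Ju Gi", "Third Word"]  # third entry is hypothetical
no_input = decide_fig7(words, [], 1.0, 1.0)
one_change = decide_fig7(words, [(0.5, 1)], 1.0, 1.0)
too_late = decide_fig7(words, [(1.2, 1)], 1.0, 1.0)
```

Each change of selection restarts the timer, so the user can move through the list step by step, and the word under the cursor is accepted only once a full standby time passes without a further change.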

[0045] FIG. 8 is a flowchart for explaining a speech recognition method, according to another 
embodiment of the present invention. The speech recognition method includes operation 81 of 
displaying a list of alternative words, operations 82, 83, and 86 performed in a case where there
is no change in a user's selection, and operations 84, 85, and 86 performed in a case where 
there is a change in a user's selection. 

[0046] Referring to FIG. 8, in operation 81, the window 91 including the alternative word list
94 listing the recognition results of the speech recognizer 13 is displayed. The time belt 93
starts from when the window 91 is displayed. In operation 82, a determination is made as to
whether an initial standby time set by the standby time setter 21 has elapsed without an 
additional user input via the alternative word selection key or button. 

[0047] If in operation 82, it is determined that the initial standby time has elapsed, in 
operation 83, a first alternative word currently indicated by the cursor is determined as a final, 
recognized word. In operation 86, a function corresponding to the final, recognized word is 
performed. If in operation 82, it is determined that the initial standby time has not elapsed, in 
operation 84, a determination is made as to whether a user's selection has been changed by 
the additional user input via the alternative word selection key or button. If in operation 84, it is 
determined that the user's selection has been changed, in operation 85, an alternative word that 
the cursor currently indicates due to the change in the user's selection is determined as the 
final, recognized word. In operation 86, a function corresponding to the final, recognized word is 
performed. If in operation 84, it is determined that the user's selection has not been changed, 
the process returns to operation 82. 
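
The flow of FIG. 8 differs from a flow that restarts the standby time in that the first change of selection finalizes the word at once. A minimal sketch follows, under the same kind of simplifying assumptions: key presses are given as timestamps measured from the display of the window 91, and a single press moves the cursor from the first alternative to the second. The function name finalize_on_first_change is hypothetical, and the list entries other than 'Tan Seong Ju Gi' are placeholders.

```python
from typing import Sequence, Tuple


def finalize_on_first_change(
    alternatives: Sequence[str],
    key_press_times: Sequence[float],
    initial_standby: float,
) -> Tuple[str, float]:
    """Return (final word, time at which it is finalized).

    Assumption: one key press moves the cursor from the first alternative
    to the next one in the list.
    """
    # Operation 82: only presses made before the initial standby time
    # elapses can change the selection.
    presses_in_time = [t for t in key_press_times if t < initial_standby]
    if not presses_in_time:
        # Operation 83: no change, so the first alternative is finalized
        # when the initial standby time elapses.
        return alternatives[0], initial_standby
    # Operations 84 and 85: the first change finalizes the newly indicated
    # word immediately, without a further standby time.
    first_press = min(presses_in_time)
    return alternatives[1 % len(alternatives)], first_press


words = ["first alternative", "Tan Seong Ju Gi", "third alternative"]
final_word, decided_at = finalize_on_first_change(words, [0.5], 2.0)
```

In this variant the decision time after a change is the time of the key press itself, which is why no adjusted standby time appears in the sketch.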

[0048] Table 3 below shows comparisons of existing speech recognition methods and a 
speech recognition method of the present invention, with respect to success rates of speech 
recognition tasks and the number of times an additional task is performed in various recognition 
environments. 



[Table 3]

                      90% Recognition Environment        70% Recognition Environment
Suggestion Method     Additional Task Performed          Additional Task Performed
of Alternative Word   0 Times  1 Time  2 Times  Total    0 Times  1 Time  2 Times  Total
-------------------   -------  ------  -------  -----    -------  ------  -------  -----
Existing method 1     90%      0%      0%       90%      70%      0%      0%       70%
Existing method 2     0%       90%     0%       90%      0%       70%     0%       70%
Existing method 3     0%       99.9%   0%       99.9%    0%       97.3%   0%       97.3%
Present Invention     90%      9%      0.9%     99.9%    70%      21%     6.3%     97.3%



[0049] Alternative words are suggested in existing method 1. In existing method 2, a user 
determines the best alternative word. In existing method 3, a user selects an alternative word 
from a list of alternative words corresponding to recognition results. Also, the data shown in 
Table 3 was obtained on the assumptions that the 90% recognition environment refers to noise in 
an office, the 70% recognition environment refers to noise in a car traveling on a highway, the 
list of alternative words to be recognized is infinite, and the alternative words on the list are 
similar. According to Table 3, when the speech recognition method of the present invention is 
adopted, the success rate of the speech recognition task is maximized as the additional task is 
iteratively performed. 
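
The figures in the Present Invention row of Table 3 are consistent with a simple model, given here only as an illustrative assumption and not stated in the description: if p is the recognition rate of the environment, the share of successes reached after k additional tasks is p(1 - p)^k. The sketch below evaluates that expression with exact fractions for p = 90% and p = 70%. The function names are hypothetical.

```python
from fractions import Fraction
from typing import List


def success_by_additional_tasks(p: Fraction, max_tasks: int = 2) -> List[Fraction]:
    """Success share attributed to 0, 1, ..., max_tasks additional tasks.

    Assumed model (not stated in the source): the share for k additional
    tasks is p * (1 - p) ** k.
    """
    return [p * (1 - p) ** k for k in range(max_tasks + 1)]


def as_percent(x: Fraction) -> str:
    """Format a fraction of 1 as a percentage string."""
    return f"{float(x * 100):g}%"


for label, p in (("90%", Fraction(9, 10)), ("70%", Fraction(7, 10))):
    shares = success_by_additional_tasks(p)
    print(label, [as_percent(s) for s in shares], "total", as_percent(sum(shares)))
```

Under this assumed model the shares sum to 1 - (1 - p)^3, which is why the total approaches 100% as the additional task is repeated.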

[0050] As described above, in a speech recognition method and apparatus according to the 
present invention, the number of times a user performs an additional task and the psychological 
pressure placed on the user can be minimized, even in a poor speech recognition environment, 
and a final success rate of speech recognition performed via a voice command can be 
maximized. As a result, efficiency of speech recognition can be improved. 

[0051] In addition, when a user's selection is not changed within a predetermined standby 
time, a subsequent task can be automatically performed. Thus, the number of times the user 
manipulates a button for speech recognition can be minimized. As a result, since the user can 
easily perform speech recognition, user satisfaction with a speech recognition system can be 
increased. Moreover, a standby time can be adaptively adjusted to a user. Thus, the time 
required to perform speech recognition tasks can be reduced. 

[0052] The present invention can be realized as computer-readable code on a 
computer-readable recording medium. For example, a speech recognition method can be accomplished 
as first and second programs recorded on a computer-readable recording medium. The first 
program includes recognizing speech uttered by a user and displaying a list of alternative words 
listing a predetermined number of recognition results in a predetermined order. The second 
program includes determining whether a user's selection from the list of alternative words has 
changed within a predetermined standby time; if the user's selection has not changed within the 
predetermined standby time, determining an alternative word from the list of alternative 
words currently indicated by a cursor as a final, recognized word; if the user's selection has 
changed within the predetermined standby time, adjusting the standby time; iteratively 
determining whether the user's selection has changed within the adjusted standby time; and if 
the user's selection has not changed within the adjusted standby time, determining an 
alternative word selected by the user as a final, recognized word. Here, the second program 
may be replaced with a program including determining whether a user's selection from a list of 
alternative words has changed within a predetermined standby time; if the user's selection has 
not changed within the predetermined standby time, determining an alternative word from the 
list of alternative words currently indicated by a cursor as a final, recognized word; and if the 
user's selection has changed within the predetermined standby time, determining an 
alternative word selected by the user as a final, recognized word. 

[0053] A computer-readable recording medium may be any kind of recording medium in which 
computer-readable data is stored. Examples of such computer-readable recording media include 
ROMs, RAMs, CD-ROMs, magnetic tapes, floppy disks, optical data storage devices, and carrier 
waves (e.g., transmission via the Internet). Also, the computer-readable code can 
be stored on computer-readable media distributed among computers connected via a network. 
Furthermore, functional programs, codes, and code segments for realizing the present invention 
can be easily inferred by programmers skilled in the art. 

[0054] Moreover, a speech recognition method and apparatus according to the present 
invention can be applied to various platforms of personal mobile communication devices such 
as personal computers, portable phones, personal digital assistants (PDAs), and so forth. As a 
result, success rates of speech recognition tasks can be improved. 

[0055] While the present invention has been particularly shown and described with reference 
to exemplary embodiments thereof, it will be understood by those of ordinary skill in the art that 
various changes in form and details may be made therein without departing from the spirit and 
scope of the present invention as defined by the following claims. 